A new front-end for classification of non-speech sounds: a study on human whistle
Abstract
Speech/non-speech sound classification is an important problem in audio diarization, audio document retrieval, and advanced human interfaces. This study develops spectral and temporal acoustic features for speech/non-speech sound classification, motivated by the production differences between speech and whistle. Seven time- and frequency-domain features are investigated. The proposed feature set is evaluated at the frame level on the task of speech/whistle classification, using support vector machine (SVM) models and Gaussian mixture models (GMM) as back-end classifiers. At the frame level, fusing the proposed front-end features gives an absolute performance gain over MFCC of +15.0% with the SVM-based classifier and +3.1% with the GMM-based classifier. This research will benefit the development of intelligent speech interfaces for identification, recognition, and speech coding, as a preprocessing step for real-world audio streams.
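To make the idea of frame-level time- and frequency-domain features concrete, the following is a minimal sketch. The abstract does not name the seven features, so zero-crossing rate and spectral flatness are used here purely as generic stand-ins. The signals (a synthetic 1500 Hz tone as a whistle-like input, Gaussian noise as a contrast), the frame sizes, and all function names are assumptions for illustration, not the authors' front-end.

```python
import math
import random

def frames(signal, size, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def zero_crossing_rate(frame):
    """Time-domain feature: fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def spectral_flatness(frame):
    """Frequency-domain feature: geometric mean / arithmetic mean of the power spectrum.

    Uses a naive DFT (no window) so the sketch needs only the standard library.
    Values near 0 indicate a tonal frame; values closer to 1 indicate a noise-like frame.
    """
    n = len(frame)
    power = []
    for k in range(1, n // 2):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        power.append(re * re + im * im + 1e-12)  # small floor avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))

def frame_features(signal, size=128, hop=64):
    """Return one (zero-crossing rate, spectral flatness) pair per frame."""
    return [(zero_crossing_rate(f), spectral_flatness(f))
            for f in frames(signal, size, hop)]

# Hypothetical test signals: a pure tone (whistle-like) and white noise.
rate = 8000
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 1500 * t / rate) for t in range(512)]
noise = [rng.gauss(0.0, 1.0) for _ in range(512)]

tone_feats = frame_features(tone)
noise_feats = frame_features(noise)
```

In a pipeline like the one the abstract describes, per-frame vectors of this kind would be passed to a back-end classifier such as an SVM or GMM. A real whistle is not a perfectly stationary tone, so the separation seen on these synthetic signals is far cleaner than it would be on recorded audio.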
Related resources
Auditory Representations of Speech Sounds in a Neural Model: the Role of Peripheral Processing
The categorization of speech sounds by the auditory system has been a subject of intense attention over several decades, reflecting its importance to the scientific study of speech perception and the technological development of more human-like capabilities in automatic speech recognition. In previous work, we have firmly established that a two-stage computational model can mimic important aspe...
Effect of sound classification by neural networks in the recognition of human hearing
In this paper, we focus on two basic issues: (a) the classification of sound by neural networks based on frequency and sound intensity parameters, and (b) evaluating the health of different human ears compared with those of a healthy person. Sound classification by a specific feed-forward neural network with two inputs, frequency and sound intensity, and two hidden layers is proposed. This process...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Comparison of the effect of production-training-based therapy with non-speech oral motor training on the speech of 4-6-year-old children with phonological disorder
Objective: Speech sound disorders are among the most common speech disorders in children. Non-speech oral motor exercises have long been used by speech-language pathologists as a facilitative activity throughout therapy sessions for a wide variety of speech disorders, but there are few controlled empirical data to evaluate their effectiveness. This study aimed at comparing the effects of therapeu...
Publication date: 2015